Agent-Based Modeling and the City: A Gallery of Applications 


Chapter - April 2021 


DOI: 10.1007/978-981-15-8983-6_46 




4 authors, including: Andrew Crooks (University at Buffalo, The State University of New York) and Ed Manley (University of Leeds)






Chapter 46
Agent-Based Modeling and the City:
A Gallery of Applications 


Andrew Crooks, Alison Heppenstall, Nick Malleson, and Ed Manley 


Abstract Agent-based modeling is a powerful simulation technique that allows one 
to build artificial worlds and populate these worlds with individual agents. Each 
agent or actor has unique behaviors and rules which govern their interactions with 
each other and their environment. It is through these interactions that more
macro-phenomena emerge: for example, how individual pedestrians lead to the emergence
of crowds. Over the past two decades, with the growth of computational power
and data, agent-based models have evolved into one of the main paradigms for urban
modeling and for understanding the various processes which shape our cities.
Agent-based models have been developed to explore a vast range of urban phenomena from
that of micro-movement of pedestrians over seconds to that of urban growth over 
decades and many other issues in between. In this chapter, we introduce readers 
to agent-based modeling from simple abstract applications to those representing 
space utilizing geographical data not only for the creation of the artificial worlds but 
also for the validation and calibration of such models through a series of example 
applications. We will then discuss how big data, data mining, and machine learning 
techniques are advancing the field of agent-based modeling and demonstrate how 
such data and techniques can be leveraged into these models, giving us a new way 
to explore cities. 


46.1 Introduction 


The start of the twenty-first century marked a milestone in human history: for the 
first time more than half of the world’s population, approximately 3.9 billion people, 
lived in urban areas. This trend is expected to continue in the foreseeable future,
with 6.3 billion people living in cities by 2050 (United Nations 2014). Population
growth will cause more urban land to be developed during the first 30 years of the 
twenty-first century than in all of human history (Angel et al. 2011). Less than five 
percent of the earth’s surface is urbanized and with the urban population predicted to 
grow to 5 billion by 2030, the urban footprint will still be less than 10% (Seto et al. 
2011). Combine this with the unprecedented urban expansion, especially in the form 
of megacities—cities with more than 10 million in population—which have grown 
from eight in the 1970s to 36 in 2016 and are expected to rise to 41 by 2030 as shown 
in Fig. 46.1, and society as a whole will be faced with unprecedented challenges and 
questions to be asked with respect to all aspects of city life. Will cities be sprawling or 
compact? How will cities adapt to climate change? How will new technologies such 
as autonomous cars, for example, affect our lives? These are challenging questions 
made more complicated by the fact that cities are excellent examples of complex 
systems, composed of people, places, flows, and activities (Batty 2013), all of which 
interact in a variety of different ways. 

An exact definition of a complex system is difficult to pin down, as it has a different 
meaning to different people (Thrift 1999). A simple definition is one whereby a 
small number of rules or laws, applied at a local level and among many entities, 
are capable of generating complex global phenomena such as collective behaviors, 
extensive spatial patterns, and hierarchies, in such a way that the actions of their 
parts do not simply sum to the activity of the whole, due to self-organization, nonlinearities,
feedbacks (both positive and negative), and path dependencies.¹ Cities are
complex systems, composed of many parts, dynamic, and containing large numbers 
of discrete actors interacting within space and with other systems from nature and 
technology, and have a wide-ranging impact on the economy, public policy, national 
defense, social trends, public health, climate change, etc. As Wilson (2000) writes, 
understanding cities is “...one of the major scientific challenges of our time.” Human 
behavior cannot be understood or predicted in the same way as in the physical sciences 
such as physics or chemistry. The actions and interactions of the inhabitants of a city, 
for example, cannot be easily described in a physical-science theory such as that of 
Newton’s Laws of Motion. This notion is captured quite aptly by a quote by Nobel 
laureate Murray Gell-Mann: “Think how hard physics would be if particles could 
think.” In the remainder of this chapter, we will introduce agent-based modeling 
(Sect. 46.2) as it offers a way to explore the processes that lead to the patterns we see 
in cities from the bottom up, but also allows us to incorporate ideas from complex 
systems (e.g. feedbacks, path dependency, emergence) along with providing a gallery 
of applications of geographically explicit agent-based models. Next, we discuss how 
we can incorporate various decision-making processes within such models, and also 
how we can integrate this style of modeling with data, with a specific emphasis on 
geographical and social information (Sect. 46.3). This section also discusses how
agent-based modelers are utilizing machine learning within their models. Finally, in
Sect. 46.4, we will provide a summary and discuss new opportunities with respect
to agent-based modeling and the city.

¹ Readers wishing to know more about cities and complexity are referred to the works of Allen
(1997), Wilson (2000), and Batty (2007).

Fig. 46.1 Global megacities in 2016 and estimated megacities by 2030 (data source: United Nations
2016)


46.2 What is Agent-Based Modeling?


Over the past two decades, with the growth of computational power and data (which 
we will discuss in more detail in Sect. 46.3), agent-based models have evolved 
into one of the main modeling paradigms for urban systems and understanding the 
problems that today’s cities face (see: Benenson and Torrens 2004; Batty 2005; 
Crooks et al. 2019). In this section, we first give a general yet brief overview of 
agent-based modeling before discussing the various reasons to model (Sect. 46.2.1). 
We then discuss steps in building such models (Sect. 46.2.2) before turning our 
attention to geographically explicit agent-based modeling examples (Sect. 46.2.3) 
which demonstrate the types of problems such a style of modeling can explore. 

Agent-based modeling, as with other modeling techniques (e.g. spatial interaction
models, microsimulation), is a way to take the complexities of the real world
and, through abstraction, reductionism, and simplification, focus on the important
task at hand (Gilbert and Troitzsch 2005). The main difference between agent-based
modeling and other styles of modeling is that the focus is on interactions of individual
entities and their behaviors, and how more aggregate patterns emerge through
such interactions (e.g. how individual cars can lead to the emergence of traffic jams).
Broadly defined, an agent-based model can be considered as an artificial world inhabited
by autonomous and heterogeneous agents, each with its own set of goals and preferences.
It is through interactions with other agents that the agent makes decisions and
decides what actions are to be carried out based on specific goals. These interactions
lead to more aggregate patterns emerging as shown in Fig. 46.2. 

For example, if one were to build an agent-based model of a housing market, 
individual agents could be considered as households. Each household has to decide 
where to live and as with real households, each can have its own preferences for 
housing style and neighborhood type, and each has its own income constraints. The
interactions with other households in the form of buying and selling a house lead to the 
emergence of property markets (e.g. Geanakoplos et al. 2012). Or considering traffic 
congestion during the morning rush hour, individual agents could be considered as 
drivers of cars: each agent has to decide what time to leave home to go to work, and 
by driving on the road its interactions with other agents (i.e. cars) are what lead to
traffic jams forming (e.g. Manley et al. 2014). 


46.2.1 Examples of Why to Model 


As with other modeling styles, within agent-based modeling, there are multiple 
reasons for why one should model, from understanding a certain phenomenon to 
predicting and forecasting (see Epstein 2008 for a discussion on the various reasons 
to model) and therefore agent-based models range from abstract thought experiments
to more empirically grounded applications. For example, Schelling’s (1971)
model of segregation is not only a classic example of an abstract model, but it also
demonstrates how emergent phenomena (in this case segregation) can occur through
individual preferences. Moreover, it demonstrates how macro-level segregation does
not necessarily reflect micro-level preferences. For example, in Fig. 46.3, we show
two types of agents, those who prefer football versus those who prefer baseball. In
this simple example, based on notions from Schelling’s (1971) model, agents (i.e.
individuals) want to be in locations (a cell on an 11 by 11 grid which acts as our
artificial world) where a certain percentage of their neighbors are similar to themselves
(in this example 30%).

Fig. 46.2 Schematic of an agent-based model, showing how interactions between agents lead to
emergent phenomena within an artificial world

Over time (T), agents move if their preference for their neighborhood composition
is not met. As one can see, from an initial randomly distributed population,
segregated neighborhoods emerge due to agents interacting with other agents and 
taking actions (in this case moving) and to the resulting feedbacks and past locational 
choices of others. Also, the model demonstrates how the actions of one agent might 
affect others. For example, an agent may be satisfied in a certain location but another 
agent moving into the neighborhood might cause this agent to become dissatisfied 
and therefore cause it to move. By altering the agent’s preferences for certain neighborhood
compositions (e.g. from 30 to 70% of similar neighbors), we can also see
how individual preferences and interactions at the micro-level lead to more macro-level
phenomena emerging as we show in Fig. 46.4; specifically in this example, we
see how more segregated communities emerge as preferences are increased. 
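
To make the mechanics of this example concrete, the following is a minimal sketch in Python of a Schelling-style model on an 11 by 11 grid. It is not the implementation behind Figs. 46.3 and 46.4: the share of empty cells, the eight-cell neighborhood, the rule that dissatisfied agents jump to a randomly chosen empty cell, and all function names are assumptions made here for illustration.

```python
import random

def similar_share(grid, cell):
    """Share of occupied neighboring cells holding the same agent type."""
    r, c = cell
    neighbors = [grid.get((r + dr, c + dc))
                 for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]
    occupied = [n for n in neighbors if n is not None]
    if not occupied:
        return 1.0  # assumption: an agent with no neighbors is satisfied
    return sum(n == grid[cell] for n in occupied) / len(occupied)

def run_schelling(size=11, threshold=0.3, empty_share=0.2, max_steps=200, seed=0):
    """Run the model; return the final grid and the number of steps taken."""
    rng = random.Random(seed)
    cells = [(r, c) for r in range(size) for c in range(size)]
    n_empty = int(len(cells) * empty_share)  # assumed to be at least one cell
    n_agents = len(cells) - n_empty
    # Two agent types (football "F" and baseball "B" fans) placed at random.
    labels = ["F", "B"] * (n_agents // 2) + ["F"] * (n_agents % 2) + [None] * n_empty
    rng.shuffle(labels)
    grid = dict(zip(cells, labels))
    for step in range(max_steps):
        unhappy = [cell for cell in cells
                   if grid[cell] is not None and similar_share(grid, cell) < threshold]
        if not unhappy:
            return grid, step  # every agent's preference is met
        rng.shuffle(unhappy)
        for cell in unhappy:
            empties = [e for e in cells if grid[e] is None]
            target = rng.choice(empties)
            grid[target], grid[cell] = grid[cell], None  # move the agent
    return grid, max_steps

def mean_similarity(grid):
    """Average share of similar neighbors, a simple macro-level segregation measure."""
    agents = [cell for cell, label in grid.items() if label is not None]
    return sum(similar_share(grid, cell) for cell in agents) / len(agents)
```

Calling `run_schelling(threshold=0.3)` and then `run_schelling(threshold=0.7)`, and comparing `mean_similarity` on the two resulting grids, is one way to explore how changing the micro-level preference alters the macro-level pattern.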

Fig. 46.3 Example of segregation emerging over time as agents move to locations where their
preferences are met (note smaller balls are dissatisfied agents)

Fig. 46.4 Examples of how different preferences lead to different patterns of segregation

What is interesting about this phenomenon is that often when we see segregated
neighborhoods, the process and actions that led to this pattern have already occurred.
However, through agent-based modeling, we can explore what processes or actions
might have led to such patterns emerging in the first place, and thus devise potential
interventions before it is too late. As noted above, agent-based models can
also be empirically grounded. Take for example the work of Benenson et al. (2002), 
which explored how people’s preferences for certain neighborhoods and building 
types lead to distinct residential patterns emerging in Tel Aviv, Israel. While both have 
their own purpose, Schelling’s (1971) to explore basic behavior and that of Benenson 
et al. (2002) to explain residential choice based on empirical data and test various 
scenarios, both show that individual preferences for certain types of neighborhoods 
lead to distinct residential patterns emerging, which would be difficult to explain 
from just looking at aggregate data alone. It should however be noted that agent-based
modeling is not just an academic exercise, but has been used by companies
and organizations for a variety of decision-making purposes. These range from the 
potential impact of decimalization of the NASDAQ Stock Market (Darley and Outkin 
2007), to that of understanding store design, consumer markets, or hiring strategies 
for companies (see Bonabeau 2003). Readers of this chapter might also be surprised 
to know that they have probably seen agent-based models while at the cinema or 
watching TV as they are often used for massive crowd scenes in movies, replacing the 
need for a large cast of extras (see Massive 2019). Companies, especially engineering 
ones, are also utilizing agent-based models to study pedestrian (e.g. products such 
as Legion 2019 and STEPS 2019) or traffic dynamics (e.g. PTV Visum 2019 and 
Paramics 2019) in order to assess new designs for buildings or traffic measures before 
they are built or implemented. 


46.2.2 Steps in Building an Agent-Based Model 


When it comes to building an agent-based model, the process can be broadly viewed 
as having three steps. First, before we can get to the model itself, we need to identify 
the research question we are trying to solve with the model (e.g. reasons for traffic 
patterns), define the target of the model, know specifically what we are trying
to solve (e.g. traffic dynamics), and consider if there are any observations of the
target we wish to include to provide parameters and initial conditions for the model
(e.g. origin-destination data). We then need to make assumptions and design the
model. Once the model has been designed and implemented (often in computer 
code), the second step is to run (execute) the model, which creates an artificial 
world. This is then populated with agents (e.g. cars) that are assigned attributes 
and rules (depending on the application or phenomena of interest). We then run the 
model until a certain condition is met or a specific time epoch is reached, and report
and observe the results; this flow is shown in Fig. 46.5a (while Fig. 46.5b shows a
simple worked example of the segregation model discussed in Sect. 46.2.1). While 
this figure and the description given above are highly generalized and simple, in 
essence, one could make the argument that agent-based models are just rule-based 
systems, in the sense that they could be considered as just a series of if-then-else 
statements. For example, if the fire alarm goes off, then exit the building, else stay in
the building. However, the richness of agent-based modeling is that while the agents
themselves might be highly specified and their rules of interaction well-known,
it is not until the model is run that we can know the outcome, due to the variety
of possible interactions of autonomous heterogeneous decision-making agents. In
essence, like complex systems themselves, agent-based models are more than the
sum of their parts. Once the model is run, the third step is to evaluate the model (e.g.
verification, calibration, validation, sensitivity analysis). For further guidelines on
designing, implementing, and evaluating agent-based models, readers are referred to
Gilbert and Troitzsch (2005) and Crooks et al. (2019).

Fig. 46.5 Highly generalized flow of an agent-based model (a) and the corresponding flow of the
basic segregation model (b)
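
The generalized flow described in this section (create the artificial world, populate it with agents, run until a stopping condition is met, and report the results) can be sketched in Python using the fire-alarm rule as the agents' if-then-else behavior. This is an illustrative sketch only: the `Occupant` class, its `reaction_delay` attribute, and the `run_model` function are hypothetical names introduced here, and the delay is an assumed way of making the agents heterogeneous.

```python
from dataclasses import dataclass

@dataclass
class Occupant:
    """Hypothetical agent: a person in a building with its own reaction delay."""
    reaction_delay: int        # ticks between the alarm starting and the agent leaving
    in_building: bool = True

    def step(self, ticks_since_alarm):
        # Rule: if the alarm has been on long enough for this agent, exit; else stay.
        if ticks_since_alarm is not None and ticks_since_alarm >= self.reaction_delay:
            self.in_building = False

def run_model(reaction_delays, alarm_tick, max_ticks):
    """Return the number of occupants still inside after each tick."""
    # Create the artificial world and populate it with agents.
    agents = [Occupant(delay) for delay in reaction_delays]
    history = []
    # Run until a stopping condition is met or the time limit is reached.
    for tick in range(max_ticks):
        ticks_since_alarm = tick - alarm_tick if tick >= alarm_tick else None
        for agent in agents:
            agent.step(ticks_since_alarm)
        inside = sum(agent.in_building for agent in agents)
        history.append(inside)  # report the state of the world at this tick
        if inside == 0:
            break               # stopping condition: building is empty
    return history

print(run_model([0, 1, 3], alarm_tick=2, max_ticks=10))  # prints [3, 3, 2, 1, 1, 0]
```

Even in this toy case the aggregate evacuation curve is not written anywhere in the rules; it only appears once the model is run.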


46.2.3 Application Areas for Geographically Explicit 
Agent-Based Models 


Geographically explicit agent-based models (i.e., those utilizing geographical information,
which we discuss in more detail in Sect. 46.3) have been developed to
explore a range of problems which society faces over a variety of spatial and temporal
scales from the micro-movement of pedestrians over seconds (e.g. Torrens 2012) to
that of the macro-evolution of city systems over centuries (Pumain and Sanders
2013). The flexibility that the agent-based modeling approach provides has allowed 
such models to be used in a diverse set of applications. These range from archeology
(Axtell et al. 2002), agriculture (Hailegiorgis et al. 2018), basketball (Oldham
and Crooks 2019), crime (Malleson et al. 2013), diseases (Perez and Dragicevic 
2009), disasters (Jumadi et al. 2018), invasive species (Anderson and Dragićević 
2018), to urban growth (Xie and Yang 2011), housing markets (Geanakoplos et al. 
2012), gentrification (Jackson et al. 2008), slum formation (Patel et al. 2018), and 
traffic (Manley and Cheng 2018). So, while agent-based modelers have been utilizing 
geographical data in their models, what has changed is the growth of data and ways of 
integrating such data within models (which will be discussed more in Sect. 46.3.2). 

Open-source agent-based modeling toolkits such as GAMA (Taillandier et al. 
2019), MASON (Luke et al. 2018), Repast (North et al. 2013), and NetLogo 
(Wilensky 1999) have evolved substantially over the past 20 years and many have 
built-in functionality to directly integrate data into models (e.g. raster and vector data 
structures), thus lowering the bar for creating geographically explicit models (for a 
review of these platforms and their applications readers are referred to Crooks et al. 
2019). For example, in Fig. 46.6, we show a selection of models created utilizing 
the MASON toolkit and its GeoMason extension for GIS integration that span both 
spatial and temporal scales. These include such things as the micro-movement of 
pedestrians over seconds to that of the macro-movement of migrants over years, 
and many things in between such as modeling traffic, responses to disasters, disease 
outbreaks, and urban growth (for access to these models see MASON 2019, and for 
equivalent geographically explicit models in NetLogo see https://www.abmgis.org/).
In addition to these general-purpose open-source toolkits which allow for a range of
urban phenomena to be simulated, where one could argue that the only constraint
is that of the modeler’s imagination, there are others that are dedicated to specific
domains such as the open-source transportation simulations (e.g. MATSim of Horni,
Nagel, and Axhausen 2016, POLARIS of Auld et al. 2016, or TRANSIMS 2019), which
are being used to study a wide range of transportation issues (e.g. daily trips, route
planning, evaluation of intelligent transportation systems) in multiple cities around
the world.

Fig. 46.6 Selection of GeoMason models across various spatial and temporal scales


46.3 Integrating Data and Decision-Making 
into Agent-Based Models 


Apart from the individual entities within agent-based models interacting with each 
other, these entities also interact with and are affected by the artificial world (or
environment) which they inhabit, similar to how the world around us affects our
lives. For example, take land-use change. Developers may buy agricultural land, 
convert the land to residential use, and then sell it to residents who then move into it 
(e.g. Magliocca et al. 2011). Agents can also perceive their environment and respond 
to it (e.g. changing climatic conditions may alter farming practices as discussed 
in Hailegiorgis, Crooks, and Cioffi-Revilla 2018). Initially, many agent-based models
represented space rather abstractly as we showed with the Schelling (1971) model 
in Sect. 46.2.1. However, perhaps with the demonstration of the Sugarscape model 
by Epstein and Axtell (1996), which showed how the environment can affect agents’ 
wealth and survival, modelers started to realize that the artificial world that the agents 
inhabited could be stylized on geographical data. From earlier works such as those 
by Gimblett (2002) or Benenson and Torrens (2004) to current day work (e.g. Crooks 
et al. 2019), researchers have utilized data not only to represent the physical aspects 
of the artificial world (e.g. land cover, road networks) but also to help inform the 
social aspects (e.g. census data to help with knowing how many agents live in an 
area). Such data take the abstract representations of space and make them more grounded
in real-world locations as we show in Fig. 46.7. 

Different data layers in the form of rasters (e.g. land-use and land-cover, elevation) 
and vector formats (e.g. census areas, road networks) can act as the environment for 
the artificial world in which our agents interact. For example, vector data about roads 
can be used for a traffic simulation in the sense of allowing agents to navigate from 
one location to another. Or census data can be used to create a specified number 
of agents for a given location with associated socio-economic characteristics (e.g. 
Burger et al. 2017). Raster data such as those from the national land-cover dataset 
(Wickham et al. 2014) can be used for initialization of an urban growth simulation 
as they provide details on urban and non-urban land extents which affect where 
cities can and cannot grow (see Crooks et al. 2019 for further details and examples 
of how one can use such data in models). Such social and physical data layers in
Fig. 46.7 replace the abstract artificial world presented in Fig. 46.2 and ground the
model to actual real-world locations, which can have an impact on individual agents’
interactions. Compare, for example, the abstract room in Fig. 46.8a which is used
to test basic pedestrian movement to that of Fig. 46.8b which is based on actual
CAD data of a real-world building. Here, actual walls, corridors, and exits constrain
the agent’s movement. While we have already discussed application areas in Sect. 46.2.3,
where researchers have created geographically explicit agent-based models to
explore a wide range of phenomena, in the remainder of this section, we first discuss
how one can incorporate decision-making into agent-based models (Sect. 46.3.1),
before turning our attention to how new forms of data are being used in such models
to help inform decision-making (Sect. 46.3.2) and how, with such data, researchers are
utilizing machine learning methods for various phases (steps) of the agent-based
modeling process (Sect. 46.3.3).

Fig. 46.7 Using geographic information as a foundation for artificial worlds

Fig. 46.8 Moving from an abstract room (a) to one where the artificial world is based on a real-world
building floor plan (b)
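
As a rough illustration of how raster and census-style layers can seed an artificial world, the Python sketch below uses a tiny hand-written land-cover grid and two made-up census zones. All values, the class codes (0 = water, 1 = non-urban land, 2 = urban), and the function names are hypothetical; a real model would read such layers from GIS files through a toolkit such as those mentioned in Sect. 46.2.3.

```python
import random

# Hypothetical 4 x 4 land-cover raster: 0 = water, 1 = non-urban land, 2 = urban.
LAND_COVER = [
    [0, 1, 1, 2],
    [0, 1, 2, 2],
    [1, 1, 2, 2],
    [1, 1, 1, 1],
]

# Hypothetical census zones: the raster cells each zone covers and its household count.
CENSUS = {
    "zone_a": {"cells": [(0, 3), (1, 2), (1, 3)], "households": 5},
    "zone_b": {"cells": [(2, 2), (2, 3)], "households": 3},
}

def developable_cells(raster):
    """Cells an urban-growth model could convert: non-urban land, never water."""
    return [(r, c) for r, row in enumerate(raster)
            for c, value in enumerate(row) if value == 1]

def create_households(census, seed=0):
    """Create one agent per census household, placed on a cell inside its zone."""
    rng = random.Random(seed)
    agents = []
    for zone, info in census.items():
        for i in range(info["households"]):
            agents.append({"id": f"{zone}_{i}", "zone": zone,
                           "cell": rng.choice(info["cells"])})
    return agents

households = create_households(CENSUS)
print(len(households), "household agents;",
      len(developable_cells(LAND_COVER)), "developable cells")
```

The design choice here is simply that the physical layer constrains where change can happen, while the social layer determines how many agents exist and where they start.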


46.3.1 Incorporating Decision-Making into Agent-Based 
Models 


As noted in Sect. 46.2.2, agent-based models are essentially rule-based systems in 
the sense that an agent’s actions are programmed directly into them. Therefore, it is 
important to consider how we go about choosing these rules. However, as discussed 
in Sect. 46.1, modeling human behavior is not as simple as it sounds. This is because 
humans do not just make random decisions, but base their actions upon their knowledge
and their abilities. In addition, it might be nice to think that human behavior is
rational, but this is not always the case. Decisions can be based on emotions, such 
as self-interest, happiness, anger, or fear (see Izard 2007). In addition, emotions can 
influence one’s decision-making by altering perceptions about the environment and 
future evaluations (Loewenstein and Lerner 2003). The question therefore is: how 
do we model human behavior? This is where agent-based models excel over other 
modeling approaches (as discussed in Sect. 46.2). Agent-based modeling allows us to 
focus on individuals or groups of individuals and give them diverse knowledge and 
abilities, which is not possible in other modeling methodologies. As such, agent-based
models act as a testing ground for a variety of theoretical assumptions and
concepts about human behavior (Stanilov 2012) within the safe environment of a 
computer simulation. 

Broadly speaking, there are three main approaches to capturing such decision-making
processes within agent-based models (Kennedy 2012). The first is a mathematical
approach such as the use of ad hoc direct and custom coding of behaviors
within the simulation, such as using random number generators to select a predefined
possible choice (e.g. to buy or sell; Gode and Sunder 1993). But people are
not random, which has led researchers to develop other methods such as directly
incorporating threshold-based rules; that is, when an environment parameter passes
a certain threshold a specific agent behavior will result (e.g. move to a new location
when the neighborhood composition reaches a certain percentage) as in the
Schelling (1971) example introduced in Sect. 46.2.1. One could argue that these
modeling approaches are appropriate when behavior can be well-specified. The
second approach to modeling human behavior within agent-based models uses
conceptual cognitive frameworks. Within such models, instead of using thresholds,
more abstract concepts such as beliefs, desires, and intentions (BDI; Rao and
Georgeff 1991) or physical, emotional, cognitive, and social factors (PECS; Schmidt 
2002) are given to individual agents. Both the BDI and PECS frameworks have been
successfully applied to modeling human behavior in a number of applications, such
as what drives people to crime (see Brantingham et al. 2005 and Malleson et al. 2010, 
respectively). 
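
The two flavors of the mathematical approach described above, a random choice among predefined options and a threshold-based rule, can be contrasted in a few lines of Python. This is a sketch under stated assumptions: the function names, the option labels, and the default 30% threshold (borrowed from the segregation example in Sect. 46.2.1) are illustrative and do not come from any of the cited models.

```python
import random

def random_choice_rule(rng, options=("buy", "sell")):
    """Ad hoc rule: a random number generator selects one predefined choice."""
    return rng.choice(options)

def threshold_rule(similar_share, threshold=0.3):
    """Threshold rule: move once the share of similar neighbors drops below the threshold."""
    return "move" if similar_share < threshold else "stay"

rng = random.Random(42)
print([random_choice_rule(rng) for _ in range(5)])   # a random sequence of "buy"/"sell"
print(threshold_rule(0.2), threshold_rule(0.5))      # prints: move stay
```

The random rule ignores the agent's surroundings entirely, whereas the threshold rule responds to an environmental parameter, which is the distinction the text draws between the two.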

These conceptual cognitive frameworks and mathematical approaches for representing
behavior, like agent-based models more generally, can both be considered
as rule-based systems and are often applied to tens to millions of agents. The
third approach, that of cognitive architectures (e.g. Soar (Laird 2012) and ACT-R
(Anderson and Lebiere 1998)), focuses on abstract or theoretical cognition of one
agent at a time with a strong emphasis on artificial intelligence. This approach is
rarely used to model more than a small number of agents, which makes its utility
for modeling challenges faced by cities rather limited. However, while there are
multiple ways of representing decision-making within agent-based models, why a
modeler chooses one over the other (Schlüter et al. 2017), or why a certain theory
was chosen (if at all) to build upon (Groeneveld et al. 2017), is rarely discussed. Readers
wishing to know more about decision-making within agent-based models are referred 
to Balke and Gilbert (2014) and to learn how such models can be used in a policy 
context see Calder et al. (2018). 


46.3.2 The Growth of Data and Its Utilization Within 
Agent-Based Models 


Coinciding with the ease of incorporating data into agent-based models (as discussed 
in Sect. 46.2.3) is the growth and availability of digital data (i.e. big data) for urban
areas, much of which has an explicit or implicit geographic component (Stefanidis
et al. 2013). Such data range from more traditional types such as census data,
remotely sensed imagery, or in situ sensing devices (e.g. weather stations and air-pollution
monitoring systems) to data from mobile sensors such as smartphones,
GPS devices attached to taxis, or social media. This rise in data in a variety of 
shapes and forms coupled with increased computational resources has led to the rise 
of urban analytics. There are several definitions for urban analytics: for example, 
Singleton et al. (2017) define it as a “multidisciplinary area of research concerned
with using new and emerging forms of data, alongside computational and statistical
techniques to study cities,” while Batty (2019) places urban analytics in the wider
scope of analytics more generally, stating the “term analytics implies a set of methods 
that can be used to explore, understand and predict properties and features of any 
system, in our case of cities.” What is common between the definitions is utilizing 
data and computational techniques to explore cities. If we first turn to data, we are 
not only referring to traditional datasets such as census and infrastructure (e.g. roads)
traditionally collected and distributed by governmental organizations and industry
but also to volunteered geographic information (e.g. OpenStreetMap) and social 
media, Internet of things (IoT), and cell phones, which are giving us new ways to 
explore the urban environment (Batty et al. 2012; Crooks et al. 2015b). 

By bringing these data together and analyzing them, we can begin to understand the
wider patterns of cities. For example, smart-city data are founded at the individual
level and through the analysis of travel cards can tell us how many people commute
into a city every day (e.g. Zhong et al. 2015) and hint at the purpose of trips when
combined with land-use information and social-media check-ins (Yang et al. 2019b).
Dockless-bike data can provide information on urban flows and impacts of new
infrastructure (e.g. Yang et al. 2019a). Similarly, cell-phone data can show origin-destination
pairs for urban mobility (e.g. Louail et al. 2015) or patterns of movement
and interactions (e.g. Malleson et al. 2018; Manley and Dennett 2019). What such 
data cannot tell us explicitly is the purpose of one’s trip or their experience of the city 
while one is there. Bringing in data about the individual (social data) from multiple 
sources (e.g. Twitter, Facebook) might help complete the picture but still gives us 
only patterns and not necessarily the processes and the underlying motivations that 
led to the patterns emerging. 
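
To illustrate the kind of pattern such individual-level records yield, the Python sketch below aggregates a handful of invented travel-card trips into origin-destination counts and inbound totals. The records, zone labels, and function names are hypothetical and stand in for the much larger datasets used in the studies cited above.

```python
from collections import Counter

# Hypothetical trip records: (traveler id, origin zone, destination zone).
TRIPS = [
    ("card_1", "A", "B"),
    ("card_2", "A", "B"),
    ("card_3", "B", "C"),
    ("card_1", "B", "A"),
    ("card_4", "A", "C"),
]

def od_matrix(trips):
    """Count trips for each (origin, destination) pair."""
    return Counter((origin, destination) for _, origin, destination in trips)

def inbound_totals(trips):
    """Count how many trips end in each zone."""
    return Counter(destination for _, _, destination in trips)

print(od_matrix(TRIPS)[("A", "B")])   # prints 2
print(inbound_totals(TRIPS)["C"])     # prints 2
```

Such counts describe where flows occur but, as noted above, say nothing about why a traveler made the trip; that is the part an agent-based model has to supply through its behavioral rules.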

Identifying how and when these patterns will emerge is extremely difficult. Take 
for example congestion: it arises as a result of individual mobility decisions based 
on factors such as life stage, accessibility to workplace, shops, or other facilities 
which are constantly changing. Congestion can build locally at pinch points, placing 
sections of the city’s transportation networks under severe strain. There is some irony 
that while we inhabit a data-rich world, without modeling it is extremely challenging 
to understand how the combination of physical environment and social dynamics 
contributes to how our cities function and grow. Data alone will not solve all the 
problems cities face, especially when using data from the past to look at the future. 
For example, with respect to financial or housing markets, we might have data on the 
stock market from 2010 to 2019 but this does not capture the 2007-2008 financial 
crisis. What happens if there is a structural change or some sort of evolution of the 
system or something happens outside of these bounds? Data capture only what they 
see, not necessarily extreme market events. Or to quote Heraclitus: “No man ever 
steps in the same river twice, for it’s not the same river and he’s not the same man.” 
This is one of the motivations for modeling, specifically agent-based models. We can 
explore such issues and pose what-if scenarios based on individuals making their 
own decisions. For example, what would be the implications of imposing congestion 
charging, in terms of improvements to both congestion and people’s activities (e.g. 
Zheng et al. 2012)? 

If we refer back to Fig. 46.7, we can utilize such data to inform our models, act 
as inputs to a model, or validate model outcomes. For example, there are numerous 
applications that are utilizing OpenStreetMap data to act as the foundation of their 
artificial worlds. These range from assessing route choice for humanitarian support 
after an earthquake (Crooks and Wise 2013) and utilizing building and infrastructure 
information during disease outbreaks (Crooks and Hailegiorgis 2014) to vehicle 
routing over a network (Horni et al. 2016) and evacuation-route choice 
(Goetz and Zipf 2012). If we turn our attention to pedestrian movement, which is 
of paramount importance if we wish to design more walkable cities, new sensor 
technology such as GPS has been used to test walking behaviors (Torrens et al. 
2012), while others have utilized CCTV to calibrate how people move through small 
areas (Crooks et al. 2015a) or calibrate crowd densities (Batty et al. 2003). Crols 
and Malleson (2019), on the other hand, used footfall data collected via sensors to 
validate their pedestrian model of daily mobility in the town center of Otley, West 
Yorkshire in order to better understand how the town center is being used by its 
inhabitants. Similarly, Grübel et al. (2019) used footfall data to validate their model 
of pedestrian flows through Westminster in London. 

New sources of data are also shedding light on how people navigate around the 
city; for example, Manley et al. (2015) found in analyzing GPS data from London 
minicabs that the shortest path models often used in transportation studies poorly 
predicted the actual behavior of minicab drivers; but through an agent-based model 
they showed how drivers used specific urban features (i.e., “anchor points”) with 
respect to navigating around the city. Moving beyond just geographic data, others 
are using natural language processing (NLP) to mine textual data to inform agent 
decision-making (Runck et al. 2019). In another example, Wise (2014) developed an 
agent-based model to explore a wildfire event and subsequent evacuation in Colorado 
Springs over the space of a week in 2012. Specifically, Wise mined social media, 
in this case, Twitter, to derive the moods of people in the area and fed this into an 
evacuation model. For example, if one of the agents (i.e. a Colorado Springs resident) 
knew that the fire was nearby, this information was passed along his or her social 
network to other agents, who then decided whether to evacuate or not. This decision 
to evacuate or not also led to congestion, which was validated based on data that 
were harvested from the crowd and news outlets. What the above examples show is 
that new sources of data can be utilized in many aspects of agent-based modeling, 
especially those related to urban applications over a variety of spatial and temporal 
scales. 
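
The information-passing mechanism described for the wildfire example can be illustrated with a minimal sketch. This is not the implementation of Wise (2014): the network, the agent names, and the fixed evacuation probability `p_evacuate` are hypothetical, and the mood information mined from social media is not represented. The sketch only shows a warning travelling along social ties, with each newly informed agent making one evacuation decision.

```python
import random
from collections import deque


def spread_warning(network, first_aware, p_evacuate=0.6, seed=0):
    """Pass a warning along social ties; each informed agent decides once."""
    rng = random.Random(seed)
    informed = set(first_aware)
    evacuating = set()
    queue = deque(first_aware)
    while queue:
        agent = queue.popleft()
        # Hypothetical decision rule: evacuate with a fixed probability.
        if rng.random() < p_evacuate:
            evacuating.add(agent)
        # Tell every friend who has not yet heard the warning.
        for friend in network.get(agent, ()):
            if friend not in informed:
                informed.add(friend)
                queue.append(friend)
    return informed, evacuating


# Hypothetical social network: "eli" and "fay" are not connected to "ana".
network = {
    "ana": ["ben", "cy"],
    "ben": ["ana", "dee"],
    "cy": ["ana"],
    "dee": ["ben"],
    "eli": ["fay"],
    "fay": ["eli"],
}
informed, evacuating = spread_warning(network, ["ana"])
```

In this toy network the warning reaches only the agents connected to the first aware resident. In a fuller model, the set of evacuating agents would then be loaded onto the road network to study congestion.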


46.3.3 The Potential of Machine Learning and Agent-Based 
Modeling 


There has been tremendous growth over the past decade in machine learning, 
a subfield of artificial intelligence, which is partly due to increases in computational 
power and the availability of data. This growth is leading to new areas of research 
within urban analytics, and terms such as geographic data science are appearing (see 
Singleton and Arribas-Bel 2019). By using machine learning techniques (such as 
genetic algorithms, artificial neural networks, Bayesian classifiers, decision trees, or 
reinforcement learning) and data mining (i.e. finding patterns in the data), researchers 
have been exploring many aspects of city life such as the identification of slums via 
decision trees (Mahabir et al. 2018) and using natural language processing to find 
meanings of place (Jenkins et al. 2016). 

However, while machine learning and data mining have seen a large growth in 
urban analytics, there has only been limited uptake of these methods in agent-based 
models, even though, as Rand (2006) notes, they are similar in the sense that both can 
be considered rule-based systems (as we discussed in Sect. 46.2.2) and both 
need to be initialized with a specific set of parameters. Both need to be run: while 
in agent-based models we observe the dynamics, in machine learning we observe 
the outputs of the machine learning process (such as numbers, rules, or categories), 
and both conclude when their stopping conditions are met (Rand 2006). For example, in 
an agent-based model, this might be when all agents are happy, while in machine 
learning, it could be when the algorithm completes its processing (e.g. the value of 
the objective function cannot be further improved). 
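
The parallel between the two kinds of stopping condition can be sketched in code. The example below is illustrative only and assumes a one-dimensional, Schelling-style toy model (agents of two hypothetical types on a ring with empty cells) and a plain gradient-descent loop. The function names, neighborhood size, and tolerance are our own assumptions and are not taken from Rand (2006).

```python
import random


def unhappy_agents(cells, threshold):
    """Indices of agents whose share of like neighbours is below threshold."""
    n = len(cells)
    unhappy = []
    for i, kind in enumerate(cells):
        if kind is None:  # empty cell
            continue
        neighbours = [cells[(i + d) % n] for d in (-2, -1, 1, 2)]
        occupied = [k for k in neighbours if k is not None]
        if occupied and sum(k == kind for k in occupied) / len(occupied) < threshold:
            unhappy.append(i)
    return unhappy


def run_abm(cells, threshold=0.5, max_steps=5000, seed=0):
    """Move one unhappy agent per step; stop once every agent is happy."""
    rng = random.Random(seed)
    cells = list(cells)
    for step in range(max_steps):
        unhappy = unhappy_agents(cells, threshold)
        if not unhappy:  # ABM stopping condition: all agents are happy
            return cells, step, True
        empties = [i for i, k in enumerate(cells) if k is None]
        if not empties:  # nowhere to move, so give up
            return cells, step, False
        mover, target = rng.choice(unhappy), rng.choice(empties)
        cells[target], cells[mover] = cells[mover], None
    return cells, max_steps, False  # step budget exhausted


def run_optimizer(f, grad, x, rate=0.1, tol=1e-12, max_iter=10_000):
    """Gradient descent; stop once the objective improves by no more than tol."""
    for it in range(max_iter):
        x_new = x - rate * grad(x)
        if f(x) - f(x_new) <= tol:  # machine learning stopping condition
            return x, it
        x = x_new
    return x, max_iter


start = ["A", "B", None, "A", "B", "A", None, "B", "B", "A", None, "A"]
final, steps, converged = run_abm(start)
x_best, iterations = run_optimizer(
    lambda v: (v - 3) ** 2, lambda v: 2 * (v - 3), x=0.0
)
```

Both loops share the same skeleton (initialize, iterate, test a stopping condition). What differs is what is observed: the trajectory of `cells` in the first case, and the final value `x_best` in the second. The toy model is not guaranteed to reach a state where all agents are happy, which is why `run_abm` also returns a convergence flag.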

As noted in Sect. 46.2.2, agent-based modeling has broadly three major steps: 
the design of the model, the execution of the model, and evaluation of the model. 
Machine learning techniques have been applied to all three of these phases (see 
Abdulkareem et al. 2019). For example, in the first phase, the designing of the 
model, machine learning has been used to derive parameter values for agent-based 
models such as in cases of human mobility and obesity (e.g. Kavak 2007; Padilla et al. 
2016). Machine learning has also been used during the running of the model, often 
for agents to learn from past experiences and make more informed decisions via rein- 
forcement learning or genetic algorithms or random forests (e.g. Ramchandani et al. 
2017; Rand 2006; Wolpert et al. 1999). Zhang et al. (2018) used neural networks for 
traffic prediction under various traffic configurations. In another example, Abdulka- 
reem et al. (2019) used Bayesian networks and survey data to explore the spread of 
cholera in Kumasi, Ghana. Specifically, they used Bayesian networks with respect 
to improving risk perception and decision-making about where to get water during 
a cholera outbreak. Others have used reinforcement learning with respect to retire- 
ment planning (Ramchandani et al. 2017) or Bayesian networks to infer agents’ 
locational choice and how this affects land-use change (Kocabas and Dragicevic 
2013). Bone and Dragicevic (2010) used reinforcement learning to achieve optimal 
forest harvesting strategies. With respect to using machine learning algorithms to 
analyze model outputs (i.e. Step 3), Heppenstall et al. (2007) used a genetic algo- 
rithm to validate model outcomes of an agent-based model which simulates the retail 
gasoline market. 
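
To make the second phase more concrete, the sketch below shows an agent that learns from past experience while the model runs. It is a deliberately simple form of reinforcement learning (an epsilon-greedy value update) and does not reproduce any of the studies cited above. The route names, mean travel times, and learning parameters are hypothetical.

```python
import random


class LearningCommuter:
    """Agent that learns route values from experienced travel times."""

    def __init__(self, routes, epsilon=0.1, learning_rate=0.2, seed=0):
        self.q = {route: 0.0 for route in routes}  # estimated reward per route
        self.epsilon = epsilon
        self.learning_rate = learning_rate
        self.rng = random.Random(seed)

    def choose(self):
        # Explore occasionally, otherwise exploit the best-known route.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.q))
        return max(self.q, key=self.q.get)

    def learn(self, route, reward):
        # Move the estimate a fraction of the way toward the new experience.
        self.q[route] += self.learning_rate * (reward - self.q[route])


def simulate(days=2000, seed=1):
    """Let one commuter travel every day and update its route estimates."""
    rng = random.Random(seed)
    # Hypothetical mean travel times in minutes.
    mean_minutes = {"highway": 30.0, "arterial": 22.0, "backstreets": 26.0}
    agent = LearningCommuter(mean_minutes, seed=seed + 1)
    for _ in range(days):
        route = agent.choose()
        travel_time = rng.gauss(mean_minutes[route], 1.0)
        agent.learn(route, -travel_time)  # reward is negative travel time
    return agent


agent = simulate()
preferred_route = max(agent.q, key=agent.q.get)
```

Because estimates start at zero and rewards are negative, untried routes initially look attractive, so the agent samples every route before settling. In a full agent-based model the travel times would not be fixed but would emerge from the choices of all other agents, which is where congestion feedback enters.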

The examples above are just a few agent-based models utilizing machine learning 
and are intended to show the reader that researchers are exploring the use of such tech- 
niques in various aspects of the agent-based modeling process. However, unlike in the 
data science community, the use of machine learning in agent-based modeling is rather limited. Perhaps, this is 
because in the data science community packages exist (such as those implemented in 
Python or R) for machine learning, but this is not the case for agent-based modeling. 
While agent-based toolkits exist, modelers still need to design and implement their 
own models, which in itself is a time-consuming task. Also, agent-based models focus 
on individual behavior, and to fully utilize machine learning one needs training data, 
which are often not available (due to ethical implications, privacy concerns, etc.) 
at the level of detail needed for agent-based models (e.g. Runck et al. 2019; Weinberger 
2011). We do not have space to delve deeper into why there has only been limited 
uptake of machine learning within agent-based models, but we envisage that with the 
growth of data, more agent-based modelers will utilize machine learning, especially 
as there are increasing calls to incorporate empirical data into models (e.g. Janssen 
and Ostrom 2006; Robinson et al. 2007) along with efforts to validate such models. 
For example, there might be abundant fine-resolution trajectory data about people’s 
movement in cities which can be used to validate movement models and thus test 
ideas and theories of what motivates such patterns to emerge. 

Note: For a greater discussion on the similarities between agent-based modeling and machine learning, 
readers are referred to Rand (2006). 


46.4 Summary and Outlook 


As the world becomes increasingly urbanized, it is becoming more 
important to understand each city as a complex system whose whole is more than the 
sum of its parts. Without such understanding, it will be difficult to grapple with future 
societal challenges such as climate change. Cities are composed of many individuals 
whose interactions and behaviors lead to many issues emerging (Sect. 46.1). In this 
chapter, we have introduced agent-based modeling (Sect. 46.2) which allows one to 
model social systems from the bottom up. The focus of such models is the creation 
of artificial worlds in which individuals are given unique behaviors and rules and 
interact with each other and their environment. It is through such interactions that 
more macro-patterns emerge: for example, how individuals form crowds, or people 
going to and from work result in traffic jams, or people buying and selling homes 
lead to property markets emerging. By integrating geographic information into such 
models, we can turn abstract artificial worlds to those that mimic real-world locations 
(Sect. 46.3). 

We also discussed how agent-based modeling has seen a large uptake over the 
past 20 years, spurred by the growth and availability of data (Sect. 46.3.2), which 
is providing many application domains for study. Such data when mined not only 
provide new ways to explore how people perceive and use the space around them, 
but also through machine learning methods can be integrated into the various aspects 
of agent-based modeling, from model parameterization to validation and calibration 
(Sect. 46.3.3). However, this is still an area which is evolving and there is still 
a significant amount of research to be done. New sources of data can potentially 
be mined to provide information pertaining to who, what, when, where, and why 
people do what they do. However, as Robert Axtell notes “...there is a large research 
program to be done over the next 20 years, or even 100 years, for building good high- 
fidelity models of human behavior and interactions” (cited by Weinberger 2011). 
Potentially, machine learning methods could help with this, especially with respect 
to improving decision-making within agent-based models. 

Moreover, readers might have noted that a gallery of applications was discussed 
in this chapter, but there were very few attempts to integrate or couple various urban 
processes together, which was often the case with more traditional styles of land-use 
transportation interaction (LUTI) models (see Wise et al. 2017 for such a discussion). 
Perhaps, this is because agent-based models are being applied at a variety of spatial 
and temporal scales depending on the question at hand. For example, coupling rush-hour 
traffic with longer-term processes such as urban growth makes it difficult to 
reconcile temporal clocks, and computational issues arise when scaling models to larger areas 
or greater numbers of agents. However, the argument could be made that we 
are still in the initial stages of understanding cities from the bottom up, and the 
focus until now has been on specific problems but not on the city as a whole system. 
There is some justification for this based upon Simon’s (1996) concept of the near- 
decomposability of systems, in which parts of a system interact among themselves in 
clusters or subgraphs, with interactions among subsystems being relatively weaker or 
fewer but not negligible, and therefore in the short term, one can study such systems 
(or problems) in isolation. 

Looking ahead, as we noted above, today we are in a data-rich world and we 
discussed how one can utilize such data for model initialization, the parameterization 
of agents’ attributes, or for the validation of model outcomes. However, as agent- 
based models are often used to simulate the behavior of complex systems, these 
systems often diverge rapidly from initial starting conditions. One way to prevent a 
simulation from diverging from reality would be to occasionally incorporate more 
up-to-date data and adjust the model accordingly. Data, especially streaming data 
produced through near-real-time observational datasets (e.g. social media or vehicle 
routing counters) could be utilized in such a case as shown in Fig. 46.9. 

This process is known as dynamic data assimilation. There is a range of techniques 
that come under the banner of data assimilation that are designed for exactly this 
purpose. However, they have largely evolved from fields such as meteorology (i.e. to 
incorporate up-to-date environmental data into weather forecasts) and only recently 
have they started to be applied to agent-based modeling (e.g. Malleson et al. 2017; 
Rai and Hu 2013; Ward et al. 2016). The marriage of data assimilation methods 
and agent-based models could be transformative for the ways that some systems, for 
example, smart cities, are modeled. In addition to this, with new sources of big data 
and methods from machine learning and the growth of computational resources, we 
are perhaps nearing a point where we can explore and model cities from the bottom 
up at resolutions and scales that have not yet been possible. 
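
As an illustration of the general idea of dynamic data assimilation, the sketch below uses a basic particle filter, one of several techniques under the data assimilation banner and not necessarily the one used in the studies cited above. Everything in it is an assumption made for illustration: a toy pedestrian-count model, a pseudo-truth with a higher arrival rate than the model believes, and a noisy sensor reading every third step. An ensemble of model runs ("particles") is weighted by how well each run matches the observation and is then resampled.

```python
import math
import random


def step(count, arrival_rate, rng, p_leave=0.1):
    """Advance a toy pedestrian model one tick and return the new agent count."""
    departures = sum(rng.random() < p_leave for _ in range(count))
    arrivals = sum(rng.random() < 0.5 for _ in range(2 * arrival_rate))
    return count - departures + arrivals


def assimilate(particles, observation, obs_sd, rng):
    """Weight particles by a Gaussian likelihood, then resample with replacement."""
    weights = [math.exp(-0.5 * ((observation - p) / obs_sd) ** 2) for p in particles]
    if sum(weights) == 0.0:  # observation too far from every particle
        return particles
    return rng.choices(particles, weights=weights, k=len(particles))


def experiment(steps=60, n_particles=100, obs_every=3, obs_sd=5.0, seed=42):
    """Compare an ensemble that assimilates data with one that never does."""
    rng = random.Random(seed)
    truth = 50
    filtered = [50] * n_particles   # ensemble corrected by observations
    open_loop = [50] * n_particles  # same model, never corrected
    err_filtered = err_open = 0.0
    for t in range(1, steps + 1):
        truth = step(truth, 12, rng)  # pseudo-truth (hypothetical rate)
        filtered = [step(p, 8, rng) for p in filtered]  # misspecified model
        open_loop = [step(p, 8, rng) for p in open_loop]
        if t % obs_every == 0:
            observation = truth + rng.gauss(0.0, obs_sd)  # noisy sensor count
            filtered = assimilate(filtered, observation, obs_sd, rng)
        err_filtered += abs(truth - sum(filtered) / n_particles)
        err_open += abs(truth - sum(open_loop) / n_particles)
    return err_filtered / steps, err_open / steps


mean_err_filtered, mean_err_open = experiment()
```

Under these assumptions the uncorrected ensemble drifts toward the equilibrium implied by its wrong arrival rate, while the assimilating ensemble is repeatedly pulled back toward the observed counts. Applying the same idea to a realistic agent-based model is considerably harder, since the state is a full population of agents rather than a single number.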

Fig. 46.9 Dynamic data assimilation and agent-based modeling 